Missing Values in Dissimilarity-Based Classification of Multi-way Data

Authors

  • Diana Porro-Muñoz
  • Robert P. W. Duin
  • Isneri Talavera-Bustamante
Abstract

Missing values occur frequently in real-world applications. This is the case for multi-way data, where objects are represented by arrays of two or more dimensions, e.g., biomedical signals represented as time-frequency matrices. Missing attributes tend to influence the analysis of the data; in classification tasks, for example, classifier performance usually deteriorates. It is therefore necessary to address the problem before classifiers are built. Although missing values are common in these types of data sets, only a few studies tackle the problem for classification purposes. In this paper, we study two approaches to overcome the missing values problem in dissimilarity-based classification of multi-way data: imputation by factorization, and a modification of the previously proposed Continuous Multi-way Shape measure for comparing multi-way objects.
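
To make the idea of imputation by factorization concrete, here is a minimal sketch. The abstract does not say which factorization the authors use, so this substitutes a common stand-in: iterated truncated SVD on a single two-way (matrix) object, where missing entries are repeatedly replaced by their low-rank reconstruction. The function name `impute_low_rank` and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def impute_low_rank(X, rank=1, n_iter=200, tol=1e-8):
    """Fill NaN entries of a 2-D array using an iterated truncated SVD.

    Illustrative sketch only; not the method of the paper.
    Assumes every column has at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initial guess: column means computed over observed entries only.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Rank-limited reconstruction of the current filled matrix.
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Observed entries are kept; only missing ones are updated.
        updated = np.where(missing, approx, X)
        change = np.max(np.abs(updated - filled))
        filled = updated
        if change < tol:
            break
    return filled

# Usage: a rank-1 matrix with one entry removed.
# The rank-1 structure implies 2 * 3 = 6 for the missing entry.
X = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
X[1, 2] = np.nan
print(impute_low_rank(X, rank=1)[1, 2])
```

For genuinely multi-way objects (three or more modes), the same fill-and-refit loop would presumably be built around a tensor factorization rather than a matrix SVD; that extension is not shown here.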

Similar resources

Classification of continuous multi-way data via dissimilarity representation

Dissertation for the degree of doctor at Delft University of Technology (Technische Universiteit Delft), by authority of the Rector Magnificus prof. Contents include: 4 Missing values in dissimilarity-based classification of multi-way data; Summary; Samenvatting; Acknowledgments.

An Evolutionary Multi-objective Discretization based on Normalized Cut

Learning models and their results depend on the quality of the input data. If raw data is not properly cleaned and structured, the results tend to be incorrect. Therefore, discretization, as one of the preprocessing techniques, plays an important role in learning processes. The most important challenge in the discretization process is to reduce the number of feature values. This operat...

The Dissimilarity Representation as a Tool for Three-Way Data Classification: A 2D Measure

The dissimilarity representation has demonstrated advantages in the solution of classification problems. Meanwhile, the representation of objects by multi-dimensional arrays is necessary in many research areas. However, the development of proper classification tools that take the multi-way structure into account is incipient. This paper introduces the use of the dissimilarity representation as ...

A Generalized Kernel Approach to Dissimilarity-based Classification

Usually, objects to be classified are represented by features. In this paper, we discuss an alternative object representation based on dissimilarity values. If such distances separate the classes well, the nearest neighbor method offers a good solution. However, dissimilarities used in practice are usually far from ideal and the performance of the nearest neighbor rule suffers from its sensitiv...

Combining Missing Data Imputation and Pattern Classification in a Multi-Layer Perceptron

Multi-Layer Perceptrons (MLPs) have been successfully applied in many pattern classification tasks. However, a drawback of these learning machines is that they cannot handle input vectors with missing data in their features. A recommended way of dealing with missing values is imputation, i.e., filling in missing data with plausible values. This paper presents a brief review of handling m...

Publication date: 2013